ANTIMICROBIAL-PROTEIN-PRODUCING
ENDOSYMBIOTIC MICRO-ORGANISMS



This invention relates to endosymbiotic
micro-organisms having the ability to produce
plant-derived antimicrobial proteins.

In this context, 'antimicrobial' proteins are 
defined as proteins possessing at least one of the 
following activities: antifungal activity (which may 
include anti-yeast activity); antibacterial 
activity. Activity includes a range of antagonistic 
effects resulting in partial inhibition or death. 
'Plant-derived' proteins are capable of being 
isolated from the seed or other parts of one or more 
plant species.

Various proteins with antimicrobial activity 
have been isolated from plant sources, and such 
proteins are often believed to take part in host 
defence mechanisms directed against invading or 
competing micro-organisms. Some of the proteins are 
well-characterised, and their amino acid sequence 
may be known. In some cases, the cDNA or gene 
encoding the protein has also been isolated and 
sequenced.



To keep out potential invaders, plants produce 
a wide array of antifungal compounds, either in a 
constitutive or an inducible manner. Several 
classes of proteins with antifungal properties have 
now been identified, including chitinases, 
beta-1,3-glucanases, ribosome-inactivating proteins,
thionins, chitin-binding lectins and zeamatins.
These proteins have gained considerable attention as 
they could potentially be used as biocontrol agents. 



The chitinases (Schlumbaum et al, 1986, Nature, 324,
363-367) and beta-1,3-glucanases have weak
activities by themselves, and are only inhibitory to
plant pathogens when applied in combination (Mauch
et al, 1988, Plant Physiol, 88, 936-942). The
chitin-binding lectins can also be classified as
rather weak antifungal factors (Broekaert et al,
1989, Science, 245, 1100-1102; Van Parijs et al,
1991, Planta, 183, 258-264). Zeamatin is a more
potent antifungal protein but its activity is
strongly reduced by the presence of ions at
physiological concentrations (Roberts and
Selitrennikoff, 1990, J Gen Microbiol, 136,
2150-2155). Permatins are also known plant
antifungal proteins (Vigers et al, 1991, Molec
Plant-Microbe Interact, 4, 315-323; Woloshuk et al,
1991, Plant Cell, 3, 619-628). Finally, thionins
(Apel et al, 1990, Physiol Plant, 80, 315-321) and
ribosome-inactivating proteins (Roberts and
Selitrennikoff, 1986, Biosci Rep, 6, 19-29; Leah et
al, 1991, J Biol Chem, 266, 1564-1573) have
antifungal activity and are known to be toxic for
human cells (Carrasco et al, 1981, Eur J Biochem,
116, 185-189; Vernon et al, 1985, Arch Biochem
Biophys, 238, 18-29; Stirpe and Barbieri, 1986, FEBS
Lett, 195, 1-8).

Other groups of potent antimicrobial proteins 
with broad spectrum activity against plant 
pathogenic fungi (and often some antibacterial 
activity) are capable of isolation from certain 
plant species. We have previously described the 
structural and antifungal properties of several such 
proteins, including: 

the small-sized cysteine-rich proteins Mj-AMP1
(antimicrobial protein 1) and Mj-AMP2 occurring in
seeds of Mirabilis jalapa (Cammue BPA et al, 1992, J
Biol Chem, 267:2228-2233; International Application
Publication Number WO92/15691 published on 17
September 1992);

Ac-AMP1 and Ac-AMP2 from Amaranthus caudatus
seeds (Broekaert WF et al, 1992, Biochemistry,
31:4308-4314; International Application Publication
Number WO92/21699 published on 10 December 1992);

Ca-AMP1 from Capsicum annuum, Bm-AMP1 from
Briza maxima and related proteins found in other
plants including Delphinium, Catapodium, Baptisia
and Microsensis species (International Patent
Application Number PCT/GB93/02179 filed on 22
October 1993);

Rs-AFP1 (antifungal protein 1) and Rs-AFP2 from
seeds of Raphanus sativus (Terras FRG et al, 1992, J
Biol Chem, 267:15301-15309) and related proteins
such as Bn-AFP1 and Bn-AFP2 from Brassica napus,
Br-AFP1 and Br-AFP2 from Brassica rapa, Sa-AFP1 and
Sa-AFP2 from Sinapis alba, At-AFP1 from Arabidopsis
thaliana, Dm-AMP1 and Dm-AMP2 from Dahlia merckii,
Cb-AMP1 and Cb-AMP2 from Cnicus benedictus, Lc-AFP
from Lathyrus cicera, Ct-AMP1 and Ct-AMP2 from
Clitoria ternatea (International Patent Application
Publication Number WO93/05153 published 18 March
1993);

Rs-nsLTP (non-specific lipid transfer protein) 
from Raphanus sativus (International Patent 
Application Publication Number WO93/05153 published 
18 March 1993).

These publications are specifically incorporated 
herein by reference. 

These and other plant-derived antimicrobial 
proteins are useful as fungicides or antibiotics to 
improve the disease-resistance or disease-tolerance
of crops either during the life of the plant or for
post-harvest crop protection. The proteins may be 
extracted from plant tissue or produced by 
expression within micro-organisms. Exposure of a 
plant pathogen to an antimicrobial protein may be 
achieved by application of the protein to plant 
parts using standard agricultural techniques (eg 
surface spraying). The proteins may also be used to 
combat fungal or bacterial disease by expression 
within plant bodies (rather than just at the 
surface). DNA encoding the antimicrobial proteins
(which may be a cDNA clone, a genomic DNA clone or 
DNA manufactured using a standard nucleic acid 
synthesiser) may be transformed into a plant, and 
the proteins expressed within transgenic plants. 

It is an object of the present invention to 
provide an alternative method to deliver the 
plant-derived antimicrobial protein to its desired 
site of action. Such a method should be generally 
applicable to a wide range of plant species and may 
be easier or more effective than other methods. 

Certain micro-organisms have the ability to 
enter into non-pathogenic endosymbiotic
relationships with a plant host. These
naturally-occurring micro-organisms, hereinafter
called 'endophytes', are capable of infecting the
plant host and being harboured within the plant but 
create no visible manifestations of disease. Such 
organisms include mutualistic and commensalistic 
endophytic organisms. The range of endophytes also 
includes organisms which can exist in the vascular 
tissues of the plant and organisms which can exist 
within the intercellular spaces of the plant. 




A method of endophyte-enhanced protection of
plants has been described in a series of patent 
applications by Crop Genetics International 
Corporation, which are discussed below and 
incorporated specifically herein by reference. 

International Application Publication Number 
WO90/13224 (published 15 November 1990) describes 
the introduction of an endophytic bacterium into a 
commercially-valuable plant (such as tobacco, 
potato, muskmelon) to enhance protection against
disease (such as tobacco mosaic virus (TMV),
Pseudomonas syringae pv. tabaci, Clavibacter
michiganese subsp. michiganese, potato virus X and
Y, Fusarium sp. and other vascular wilt fungi). The
endophyte is preferably Clavibacter xyli subsp.
cynodontis (Cxc). The endophyte may be introduced
into the plant by several methods including 
impregnating the seed with a suspension of the 
endophyte, using a seed coating, injecting the 
plant, and using a soil or foliar drench. 

The endophyte may be unmodified, genetically 
modified (as discussed below) or formulated with 
other components to provide additional beneficial 
properties.

The endophyte may be genetically modified to 
produce agricultural chemicals. In this case, 
genetic material is derived from an agricultural- 
chemical-producing micro-organism and combined with 
a suitable endophyte. Combination of genetic 
material is achieved by: 

(a) forming a fusion hybrid between an endophytic
bacterium and an agricultural-chemical-
producing bacterium (European Patent
Publication Number EP-125468-B1, published 28
October 1992); or
(b) the use of recombinant techniques (insertion of 
DNA encoding an agricultural chemical); for 
example, transforming the endophyte with an 
expression vector which directs production of 
an agricultural chemical (International 
Application Publication Number WO91/10363, 
published 24 July 1991 and International 
Application Publication Number WO87/03303, 
published 4 June 1987). 
Use of the modified endophyte can improve the 
disease tolerance of a plant host (when compared to 
direct application of the agrochemical or 
agrochemical-producing bacterium). The endophyte
may be further improved by additional genetic 
modification using natural or artificial techniques 
(such as mutagenesis). For example, the endophyte 
may be modified to excrete the agricultural chemical 
in a particular form. 

The source of DNA encoding the agricultural 
chemical is a suitable micro-organism. Such 
agricultural-chemical-producing micro-organisms are
described in Table I (page 27) of International 
Application Publication Number WO91/10363 and 
include a wide variety of micro-organisms producing 
antibiotics, antifungal agents, antibacterial 
agents, antiviral agents, insecticides, nematocides, 
miticides, herbicides, fertilisers (nitrogen-fixing
or phosphate solubilising agents), plant growth 
regulators or anti-feeding agents. 

Suitable endophytes include Agrobacterium
tumefaciens, Erwinia carotovora, Pseudomonas
solanacearum, Pseudomonas syringae, Xanthomonas
campestris, Streptomyces ipomoea for dicotyledonous
plants; Erwinia stewartii, Xanthomonas campestris,
Azospirillum lipoferum, Azospirillum brasilense,
Pseudomonas syringae for monocotyledonous plants.
Clavibacter xyli subsp. xyli and Clavibacter xyli
subsp. cynodontis (Cxc) are particularly useful for
grasses such as maize, sorghum and the like.

The agricultural-chemical-producing endophytes
may be used to enhance disease protection in any 
plant, including those producing fruit, vegetables 
and flowers, trees, field and row plants such as 
corn, sorghum, wheat, barley, oats, rice, brome 
grass, sugar cane, cotton, potatoes, tomatoes, 
cabbage, cauliflower, broccoli, melons, cucumbers. 

International Application Publication Number 
WO88/09114 (published 1 December 1988) describes 
plants colonised by beneficial endophytic 
micro-organisms obtained by germination of seeds 
impregnated with the endophytes. The endophyte may 
be a strain of the genus Clavibacter or Rhizobium,
and may be genetically modified to produce an 
agricultural chemical. The seed may be from the 
Gramineae, Leguminosae or Malvaceae family. 
International Application Publication Number 
WO91/11907 (published 22 August 1991) describes the 
production of modified seed (particularly rice) 
containing an unmodified or modified endophyte 
(particularly Cxc) to produce a plant of reduced 
stature.

Crop Genetics International have already 
developed a corn bioinsecticide based upon this 
endophyte technology (trademark: INCIDE Technology). 
The INCIDE bioinsecticide consists of the endophyte
Clavibacter xyli subsp. cynodontis (Cxc) which has
been genetically modified with an endotoxin gene
derived from the bacterium Bacillus thuringiensis,
and thus expresses a protein which is toxic to 
certain insect larvae. If corn seed is inoculated 
with the INCIDE vaccine, the modified Cxc inhabits 
the vascular tissue of plants grown from this seed 
and the crop is protected from attack by cornborer 
larvae. However, there may be an associated yield 
reduction in certain crop species or varieties 
(Agrow, 13/11/92, no 172, p 6).

European Patent Application Publication Number 
185005 (Monsanto Co, published 18 June 1986) also 
describes a "plant-colonizing micro-organism" 
(herein called an endophyte) which has been
genetically modified to express a B thuringiensis
protein.

When using an agricultural-chemical-producing 
endophyte to enhance disease protection in a plant, 
the source of DNA encoding the agricultural chemical 
is a suitable micro-organism. Plant-derived DNA 
sequences encoding antimicrobial proteins have not 
previously been used to modify the endophytes. 

To improve disease-resistance or
disease-tolerance of crops, plant-derived
antimicrobial proteins may be produced within the
crop plant by expression of a gene incorporated into
the plant genome. This may involve over-expression 
of an inherent protein or expression of a protein 
derived from another plant species. We now provide 
the means to express the antimicrobial protein 
within the crop plant without requiring plant 
transformation. 



According to the invention, there is provided
a method of producing antimicrobial-protein-
producing micro-organisms capable of entering into
endosymbiotic relationships with a plant host
comprising the combination of genetic material
encoding a plant-derived antimicrobial protein with
an endophyte.

There is further provided antimicrobial- 
protein-producing micro-organisms produced according 
to the method of the invention, and seed and plants 
treated with said micro-organisms. Antimicrobial
protein may thus be expressed within the plant by an 
endophyte rather than being directly expressed by 
the host crop plant. 

As noted above, use of a genetically modified 
endophyte to deliver an agricultural chemical 
(including antifungal agents) has been described. 
However, the agricultural chemical was expressed 
from a gene derived from another micro-organism 
(usually a bacterium). Genes encoding plant-derived
antimicrobial proteins have not been previously used 
(or suggested) to modify the endophyte. 

Examples of plants which may be protected using 
the antimicrobial-protein-producing micro-organisms 
include field crops, cereals, fruit and vegetables 
such as: canola, oil seed rape, sunflower, tobacco, 
sugarbeet, cotton, soya, maize, wheat, barley, rice, 
sorghum, tomatoes, mangoes, peaches, apples, pears, 
strawberries, bananas, melons, potatoes, carrot, 
lettuce, cabbage, onion. 



DNA encoding any plant-derived antimicrobial
protein may be used in the method according to the
invention (for example, DNA encoding chitinases, 
hevein, lectins, thionins, etc). 

By way of example only, DNA encoding the 
following plant-derived antimicrobial proteins may 
be used in the method according to the invention: 
Mj-AMP1, Mj-AMP2, Ac-AMP1, Ac-AMP2, Ca-AMP1,
Bm-AMP1, Rs-AFP1, Rs-AFP2, Br-AFP1, Br-AFP2,
Bn-AFP1, Bn-AFP2, Sa-AFP1, Sa-AFP2, At-AFP1,
Dm-AMP1, Dm-AMP2, Cb-AMP1, Cb-AMP2, Lc-AFP, Ct-AMP1,
Ct-AMP2, Rs-nsLTP. These proteins show a high level
and wide spectrum of antifungal activity, and will 
be particularly useful for improving 
disease-resistance or disease-tolerance in crops. 
In particular, one or more of these potent 
antimicrobial proteins may be used in conjunction 
with a slower-growing endophyte as a relatively low 
dose of the highly active protein may be needed to 
provide disease protection. The presence of a 
slower-growing endophyte may result in less 
diversion of the host plant's metabolic resources, 
maintaining crop yield. In addition, use of these 
potent plant-derived antimicrobial proteins may 
extend the range of plant hosts most suitable as 
targets for this type of disease protection. Even 
endophytes which are relatively poor colonisers of 
certain plant species (such as Cxc on wheat) may be 
engineered to express one or more of the potent 
proteins to give the desired level of protection to 
the host plant. 

The invention will now be described by way of 
example only, with reference to the Sequence Listing 
in which: 

SEQ ID NO:1 is the amino acid sequence of
Mj-AMP1.

SEQ ID NO:2 is the amino acid sequence of
Mj-AMP2.

SEQ ID NO:3 is the nucleotide sequence of
Mj-AMP1.

SEQ ID NO:4 is the amino acid sequence of
Mj-AMP1 deduced from SEQ ID NO:3.

SEQ ID NO:5 is the nucleotide sequence of
Mj-AMP2.

SEQ ID NO:6 is the amino acid sequence of
Mj-AMP2 deduced from SEQ ID NO:5.

SEQ ID NO:7 is the amino acid sequence of
Ac-AMP1.

SEQ ID NO:8 is the amino acid sequence of
Ac-AMP2.

SEQ ID NO:9 is the nucleotide sequence of
Ac-AMP2.

SEQ ID NO:10 is the amino acid sequence of
Ac-AMP2 deduced from SEQ ID NO:9.

SEQ ID NO:11 is the amino acid sequence of
Ca-AMP1.

SEQ ID NO:12 is one possible predicted DNA
sequence for the Ca-AMP1 gene.

SEQ ID NO:13 is the amino acid sequence of
Bm-AMP1.

SEQ ID NO:14 is one possible predicted DNA
sequence for the Bm-AMP1 gene.

SEQ ID NO:15 is the amino acid sequence of
Rs-AFP1.

SEQ ID NO:16 is the amino acid sequence of
Rs-AFP2.

SEQ ID NO:17 is the amino acid sequence of
Br-AFP1.

SEQ ID NO:18 is the amino acid sequence of
Br-AFP2.

SEQ ID NO:19 is the amino acid sequence of
Bn-AFP1.

SEQ ID NO:20 is the amino acid sequence of
Bn-AFP2.

SEQ ID NO:21 is the amino acid sequence of
Sa-AFP1.

SEQ ID NO:22 is the amino acid sequence of
Sa-AFP2.

SEQ ID NO:23 is the amino acid sequence of
At-AFP1.

SEQ ID NO:24 is the amino acid sequence of
Dm-AMP1.

SEQ ID NO:25 is the amino acid sequence of
Dm-AMP2.

SEQ ID NO:26 is the amino acid sequence of
Cb-AMP1.

SEQ ID NO:27 is the amino acid sequence of
Cb-AMP2.

SEQ ID NO:28 is the amino acid sequence of
Lc-AFP.

SEQ ID NO:29 is the amino acid sequence of
Ct-AMP1.

SEQ ID NO:30 is the amino acid sequence of
Rs-nsLTP.

SEQ ID NO:31 is one possible predicted DNA
sequence for the Dm-AMP1 gene.

SEQ ID NO:32 is one possible predicted DNA
sequence for the Dm-AMP2 gene.

SEQ ID NO:33 is one possible predicted DNA
sequence for the Cb-AMP1 gene.

SEQ ID NO:34 is one possible predicted DNA
sequence for the Cb-AMP2 gene.

SEQ ID NO:35 is one possible predicted DNA
sequence for the Lc-AFP gene.

SEQ ID NO:36 is one possible predicted DNA
sequence for the Ct-AMP1 gene.

SEQ ID NO:37 is the full length cDNA sequence
of Rs-AFP1.

SEQ ID NO:38 is the amino acid sequence of
Rs-AFP1 deduced from SEQ ID NO:37.

SEQ ID NO:39 is the truncated cDNA sequence of
Rs-AFP2.

SEQ ID NO:40 is the amino acid sequence of
Rs-AFP2 deduced from SEQ ID NO:39.

SEQ ID NO:41 is the full length DNA sequence of
PCR-assisted site-directed mutagenesis of Rs-AFP2.

SEQ ID NO:42 is the amino acid sequence of
Rs-AFP2 deduced from SEQ ID NO:41.
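
Several of the amino acid sequences listed above are described as "deduced from" a nucleotide sequence. As an illustration only, the following Python sketch shows one way such a deduction can be checked: it translates the start of SEQ ID NO:3 with the standard genetic code, reading from the first base as the listing does, and confirms that the mature Mj-AMP1 sequence (SEQ ID NO:1) forms the C-terminal end of the result. The function and variable names are illustrative assumptions and are not part of the specification.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 186 bases of SEQ ID NO:3 (spaces removed), ending at the TGA stop codon.
SEQ3_PREFIX = (
    "CTTCCCGTTGCCTTCCTCAAATTCGCTATTGTGTTGATTCTCTTCATTGCCATGTCCGCA"
    "ATGATAGAAGCACAATGCATAGGAAATGGAGGAAGATGTAACGAGAACGTGGGGCCACCA"
    "TACTGCTGCTCCGGTTTCTGCCTCCGTCAACCTGGACAAGGTTATGGATATTGTAAGAAC"
    "CGCTGA"
)

# SEQ ID NO:1 (mature Mj-AMP1) written in one-letter code.
MJ_AMP1_MATURE = "QCIGNGGRCNENVGPPYCCSGFCLRQPGQGYGYCKNR"

deduced = translate(SEQ3_PREFIX)
print(len(deduced))                      # 61, the length stated for SEQ ID NO:4
print(deduced.endswith(MJ_AMP1_MATURE))  # True
```

The same check can be applied to the other cDNA and deduced-protein pairs in the Sequence Listing, provided the reading frame of each listing is taken into account.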



EXAMPLE 1 

Expression of Raphanus sativus Antifungal
Protein 2 (Rs-AFP2) by the endophyte Clavibacter
xyli subsp. cynodontis (Cxc).

The Rs-AFP2 protein is expressed in a system
analogous to that which is known to express the
Bacillus thuringiensis endotoxin. An
oligonucleotide sequence coding for the antifungal
protein Rs-AFP2 is prepared using Cxc-compatible
codons. This oligonucleotide sequence comprises
appropriate restriction sites to enable it to be
exchanged with the Bacillus thuringiensis endotoxin
gene sequence present in the INCIDE Cxc bacterium. 
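
The design step described above (back-translating the protein with host-compatible codons and flanking the coding region with restriction sites) can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the procedure actually used: the codon table is a hypothetical GC-rich preference (one codon per amino acid) rather than measured Cxc codon usage, the EcoRI (GAATTC) and HindIII (AAGCTT) sites are arbitrary placeholders for the "appropriate restriction sites", the added ATG start and TGA stop codons are assumptions, and the input is the 36-residue Rs-AFP2 sequence of SEQ ID NO:16.

```python
from itertools import product

# Hypothetical codon preference (assumption): one GC-rich codon per amino acid.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTC",
}

# Standard genetic code, used only to verify the design by re-translation.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def back_translate(protein: str) -> str:
    """Encode a one-letter protein sequence with the preferred codons."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def design_insert(protein: str, site_5: str = "GAATTC", site_3: str = "AAGCTT") -> str:
    """Return site_5 + ATG + coding sequence + TGA + site_3.

    Raises ValueError if either flanking site also occurs inside the coding
    region, since the fragment could then not be excised intact.
    """
    coding = "ATG" + back_translate(protein) + "TGA"
    for site in (site_5, site_3):
        if site in coding:
            raise ValueError(f"internal restriction site {site} in coding region")
    return site_5 + coding + site_3


# Rs-AFP2 residues from SEQ ID NO:16, in one-letter code.
RS_AFP2 = "QKLCQRPSGTWSGVCGNNNACKNQCIRLEKARHGSC"

insert = design_insert(RS_AFP2)
coding_region = insert[6:-6]  # strip the two 6-base flanking sites
print(len(coding_region))                          # 114
print(translate(coding_region) == "M" + RS_AFP2)   # True
```

Re-translating the designed coding region and comparing it with the input protein is a simple guard against errors in the codon table; a real design would also need the actual codon usage of the host and the restriction sites present in the existing construct.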

Southern analysis is used to check that Cxc is 
transformed with the Rs-AFP2 gene. If the result
is positive, the bacterium is cultured to determine 
whether it is capable of expressing Rs-AFP2 protein 
in vitro. Western analysis and antifungal assays 
are carried out on the fermentation products to
determine whether the protein is produced in the
correctly folded form as found in the native plant.
It is known that the protein loses antifungal
activity when it is reduced and hence unfolded. 



EXAMPLE 2 
Protection of rice plants using 
Rs-AFP2-producing Cxc as an antifungal agent. 

Cultures of Cxc which are capable of 
expressing Rs-AFP2 protein are used to treat rice 
plants by a soil drench or seed treatment method. 

The rice plants are challenged with rice 
blast, Pyricularia oryzae, and assessed for
increased resistance to the pathogen over non-Cxc-
infected plants. Rs-AFP2 is known to be active
against P oryzae in in vitro tests.



SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: ZENECA Limited

(ii) TITLE OF INVENTION: ANTIMICROBIAL-PROTEIN-PRODUCING
ENDOSYMBIOTIC MICRO-ORGANISMS

(iii) NUMBER OF SEQUENCES: 42

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: ICI GROUP PATENTS SERVICES DEPT 

(B) STREET: PO BOX 6, SHIRE PARK, BESSEMER ROAD

(C) CITY: WELWYN GARDEN CITY

(D) STATE: HERTFORDSHIRE 

(E) COUNTRY: UNITED KINGDOM 

(F) ZIP: AL7 1HD

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: GB 9300281.4

(B) FILING DATE: 08-JAN-1993 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: ROBERTS, TIMOTHY W

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 44 707 323400 

(B) TELEFAX: 44 707 337454 

(C) TELEX: 94028500 ICIC G 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Gln Cys Ile Gly Asn Gly Gly Arg Cys Asn Glu Asn Val Gly Pro Pro
1 5 10 15

Tyr Cys Cys Ser Gly Phe Cys Leu Arg Gln Pro Gly Gln Gly Tyr Gly
20 25 30 

Tyr Cys Lys Asn Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Cys Ile Gly Asn Gly Gly Arg Cys Asn Glu Asn Val Gly Pro Pro Tyr
1 5 10 15

Cys Cys Ser Gly Phe Cys Leu Arg Gln Pro Asn Gln Gly Tyr Gly Val
20 25 30 

Cys Arg Asn Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 360 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CTTCCCGTTG CCTTCCTCAA ATTCGCTATT GTGTTGATTC TCTTCATTGC CATGTCCGCA 60

ATGATAGAAG CACAATGCAT AGGAAATGGA GGAAGATGTA ACGAGAACGT GGGGCCACCA 120

TACTGCTGCT CCGGTTTCTG CCTCCGTCAA CCTGGACAAG GTTATGGATA TTGTAAGAAC 180 

CGCTGAGCAA GAGCATGAAA GCAAGGCCAA TGTGTGGTCT ACTAATTTAG CCTCAAATGT 240 

TATTTATTTG CATGTCTTGT GTTTCTTAAT TACCTTCTTT GTGTCTAAGA AGGTATAGAT 300 

CAATAGTTTC TACTTTACTA CTATGAATAA GAGGCTTTGA TTTGGTTTAA AAAAAAAAAA 360 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Leu Pro Val Ala Phe Leu Lys Phe Ala Ile Val Leu Ile Leu Phe Ile
1 5 10 15

Ala Met Ser Ala Met Ile Glu Ala Gln Cys Ile Gly Asn Gly Gly Arg
20 25 30

Cys Asn Glu Asn Val Gly Pro Pro Tyr Cys Cys Ser Gly Phe Cys Leu
35 40 45

Arg Gln Pro Gly Gln Gly Tyr Gly Tyr Cys Lys Asn Arg
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 433 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ATATCATTCA AATATACTAA ACTAATTATA AAAAATGGCT AAGGTTCCAA TTGCCTTTCT 60

CAAATTCGTC ATCGTGTTGA TTCTCTTCAT TGCCATGTCA GGCATGATAG AAGCATGCAT 120

AGGAAATGGA GGAAGATGTA ACGAGAACGT GGGCCCACCA TACTGCTGTT CGGGTTTCTG 180

CCTCCGTCAA CCTAACCAAG GTTACGGTGT TTGCAGGAAC CGCTAATAAG CAAAGCCCAA 240

AGTGTGGGTC ACAAAATAGT AGAGTTTAGC CTCAAATGTG GTTTATATAT GTAACAATCT 300

TATATGTGTT TCTCTTGTGT TTCTTAATTA CCTTCTTTGT GTCTAAGAAG GTATGGATAA 360

ATAGTTTGTA CTTTACTATT ATGGTTTTTT CTTATATCAA TAAGAGGCTT TAATTAAAAA 420

AAAAAAAAAA AAA 433

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 63 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ala Lys Val Pro Ile Ala Phe Leu Lys Phe Val Ile Val Leu Ile
1 5 10 15

Leu Phe Ile Ala Met Ser Gly Met Ile Glu Ala Cys Ile Gly Asn Gly
20 25 30

Gly Arg Cys Asn Glu Asn Val Gly Pro Pro Tyr Cys Cys Ser Gly Phe
35 40 45

Cys Leu Arg Gln Pro Asn Gln Gly Tyr Gly Val Cys Arg Asn Arg
50 55 60

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Val Gly Glu Cys Val Arg Gly Arg Cys Pro Ser Gly Met Cys Cys Ser
1 5 10 15

Gln Phe Gly Tyr Cys Gly Lys Gly Pro Lys Tyr Cys Gly
20 25

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Val Gly Glu Cys Val Arg Gly Arg Cys Pro Ser Gly Met Cys Cys Ser 
1 5 10 15

Gln Phe Gly Tyr Cys Gly Lys Gly Pro Lys Tyr Cys Gly Arg
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 590 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CAAAAAAAAA AAATAAAGTC AAGAGTATTA ATTAGGTGAG AAAAAATGGT GAACATGAAG 60

TGTGTTGCAT TGATAGTTAT AGTTATGATG GCGTTTATGA TGGTGGATCC ATCAATGGGA 120

GTGGGAGAAT GTGTGAGAGG ACGTTGCCCA AGTGGGATGT GTTGCAGTCA GTTTGGGTAC 180

TGTGGTAAAG GCCCAAAGTA CTGTGGCCGT GCCAGTACTA CTGTGGATCA CCAAGCTGAT 240

GTTGCTGCCA CCAAAACTGC CAAGAATCCT ACCGATGCTA AACTTGCTGG TGCTGGTAGT 300

CCATGAAAGT AGTAGCTAGC TAGGTTCACG TTGGATTACC AAGCCGTGCC AGTACTACTG 360

TGGCCGTGCC AGTACTAATG TTCTCTTATA TGTCTGAAAT AAGCTCCTAT ATAAATACTA 420

GTATCTTGAT GTAATGGAGT ATTTTCATTT TGTTTTTATT TGAGTTATGA TCGTGACTTC 480

CTTGTGTTGG TTTAACTTGT ATATTGTAAT GCATCTTAAA TGCTGTCTCA AATAATTTGA 540

TGTATTAAAC ACTTGTTTTG TTTTTAATAC ATACTAAGTG CTGTAAATTC 590

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 86 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Val Asn Met Lys Cys Val Ala Leu Ile Val Ile Val Met Met Ala
1 5 10 15

Phe Met Met Val Asp Pro Ser Met Gly Val Gly Glu Cys Val Arg Gly
20 25 30

Arg Cys Pro Ser Gly Met Cys Cys Ser Gln Phe Gly Tyr Cys Gly Lys
35 40 45

Gly Pro Lys Tyr Cys Gly Arg Ala Ser Thr Thr Val Asp His Gln Ala
50 55 60 

Asp Val Ala Ala Thr Lys Thr Ala Lys Asn Pro Thr Asp Ala Lys Leu 
65 70 75 80 

Ala Gly Ala Gly Ser Pro 
85 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Gln Glu Gln Cys Gly Asn Gln Ala Gly Gly Arg Ala Cys Ala Asn Arg
1 5 10 15

Leu Cys Cys Ser Gln Tyr Gly Tyr Cys Gly Ser Thr Arg Ala Tyr Cys
20 25 30

Gly Val Gly Cys Gln Ser Asn Cys Gly Arg
35 40 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CAAGAGCAAT GCGGAAACCA AGCTGGAGGA AGAGCTTGCG CTAACAGACT TTGCTGCTCT 60 
CAATACGGAT ACTGCGGATC TACTAGAGCT TACTGCGGAG TTGGATGCCA ATCTAACTGC 120 
GGAAGA 126 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 15

(D) OTHER INFORMATION: /note= "Xaa at position 15 may be R
or H"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 29

(D) OTHER INFORMATION: /note= "Xaa at position 29 may be S
or N"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Cys Ser Ser His Asn Pro Cys Pro Arg His Gln Cys Cys Ser Xaa Tyr
1 5 10 15

Gly Tyr Cys Gly Leu Gly Ser Asp Tyr Cys Gly Leu Xaa Cys Arg Gly
20 25 30 

Gly Pro Cys Asp Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 111 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
TGCTCTTCTC ACAACCCGTG CCCGAGACAC CAATGCTGCT CTAAGTACGG ATACTGCGGA 60 
CTTGGATCTG ACTACTGCGG ACTTGGATGC AGAGGAGGAC CGTGCGACAG A 111
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Asn Leu Glu Lys Ala Arg
20 25 30 

His Gly Ser Cys Asn Tyr Val Phe Pro Ala His Lys 
35 40 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Gln Lys Leu Cys Gln Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Arg Leu Glu Lys Ala Arg
20 25 30 

His Gly Ser Cys 
35 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Asn
20 25 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Xaa Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Arg
20 25 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Asn Leu Glu Lys
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn 
20 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys
20 25 

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Gln Lys Leu Cys Gln Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Arg Asn Gln Cys Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Ser Asn Ala Cys Lys Asn Gln Cys Ile Asn
20 25 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Glu Leu Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn
1 5 10 15

Thr Gly His Cys Asp Asn Gln Cys Lys Ser Trp Glu Gly Ala Ala His
20 25 30

Gly Ala Cys His Val Arg Asn Gly Lys His Met Cys Phe Cys Tyr Phe
35 40 45

Asn Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Glu Val Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn 
1 5 10 15

Thr Gly His Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 






Glu Leu Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn 
1 5 10 15

Thr Lys His Cys Asp Asp Gln Cys Lys Ser Trp Glu Gly Ala Ala His
20 25 30

Gly Ala Cys His Val Arg Asn Gly Lys His Met Cys Phe Cys Tyr Phe
35 40 45

Asn Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Glu Leu Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn 
1 5 10 15

Thr Lys His Cys Asp Asn Lys Cys Lys Ser Trp Glu Gly Ala Ala His 
20 25 30 

Gly Ala Cys His Val Arg Ser Gly Lys His Met Cys Phe Cys Tyr Phe 
35 40 45 

Asn Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
Lys Thr Cys Glu Asn Leu Ser Gly Thr Phe Lys Gly Pro Cys Ile Pro
1 5 10 15

Asp Gly Asn Cys Asn Lys His Cys Lys Asn Asn Glu His Leu Leu Ser
20 25 30

Gly Arg Cys Arg Asp Asp Phe Xaa Cys Trp Cys Thr Arg Asn Cys
35 40 45

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Asn Leu Cys Glu Arg Ala Ser Leu Thr Trp Thr Gly Asn Cys Gly Asn 
1 5 10 15

Thr Gly His Cys Asp Thr Gln Cys Arg Asn Trp Glu Ser Ala Lys His
20 25 30 

Gly Ala Cys His Lys Arg Gly Asn Trp Lys Cys Phe Cys Tyr Phe Asp 
35 40 45 

Cys 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Ala Leu Ser Cys Gly Thr Val Asn Ser Asn Leu Ala Ala Cys Ile Gly
1 5 10 15

Tyr Leu Thr Gln Asn Ala Pro Leu Ala Arg Gly Cys Cys Thr Gly Val
20 25 30 






Thr Asn Leu Asn Asn Met Ala Xaa Thr Thr Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
GAGCTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TGGACATTGC 
GATAACCAAT GCAAGTCTTG GGAGGGAGCT GCTCATGGAG CTTGCCATGT TAGAAACGGA 
AAGCATATGT GCTTCTGCTA CTTCAACTGC 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GAGGTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TGGACATTGC 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GAGCTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TAAGCATTGC 60

GATGATCAAT GCAAGTCTTG GGAGGGAGCT GCTCATGGAG CTTGCCATGT TAGAAACGGA 120 

AAGCATATGT GCTTCTGCTA CTTCAACTGC 150 
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GAGCTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TAAGCATTGC 60 
GATAACAAGT GCAAGTCTTG GGAGGGAGCT GCTCATGGAG CTTGCCATGT TAGATCTGGA 120 
AAGCATATGT GCTTCTGCTA CTTCAACTGC 150 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
AAGACTTGCG AGAACCTTTC TGGAACTTTC AAGGGACCAT GCATTCCAGA TGGAAACTGC 60 
AACAAGCATT GCAAGAACAA CGAGCATCTT CTTTCTGGAA GATGCAGAGA TGATTTCNNN 120 
TGCTGGTGCA CTAGAAACTG C 141
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 147 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: cDNA 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
AACCTTTGCG AGAGAGCTTC TCTTACTTGG ACTGGAAACT GCGGAAACAC TGGACATTGC 
GATACTCAAT GCAGAAACTG GGAGTCTGCT AAGCATGGAG CTTGCCATAA GAGAGGAAAC 
TGGAAGTGCT TCTGCTACTT CGATTGC 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

GTTTTATTAG TGATCATGGC TAAGTTTGCG TCCATCATCG CACTTCTTTT TGCTGCTCTT 60 

GTTCTTTTTG CTGCTTTCGA AGCACCAACA ATGGTGGAAG CACAGAAGTT GTGCGAAAGG 120 

CCAAGTGGGA CATGGTCAGG AGTCTGTGGA AACAATAACG CATGCAAGAA TCAGTGCATT 180 

AACCTTGAGA AAGCACGACA TGGATCTTGC AACTATGTCT TCCCAGCTCA CAAGTGTATC 240 

TGCTACTTTC CTTGTTAATT TATCGCAAAC TCTTTGGTGA ATAGTTTTTA TGTAATTTAC 300 

ACAAAATAAG TCAGTGTCAC TATCCATGAG TGATTTTAAG ACATGTACCA GATATGTTAT 360 

GTTGGTTCGG TTATACAAAT AAAGTTTTAT TCACCAAAAA AAAAAAAAAA AAAA 414
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
Met Ala Lys Phe Ala Ser Ile Ile Ala Leu Leu Phe Ala Ala Leu Val
1 5 10 15

Leu Phe Ala Ala Phe Glu Ala Pro Thr Met Val Glu Ala Gln Lys Leu
20 25 30

Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly Asn Asn Asn
35 40 45

Ala Cys Lys Asn Gln Cys Ile Asn Leu Glu Lys Ala Arg His Gly Ser
50 55 60

Cys Asn Tyr Val Phe Pro Ala His Lys Cys Ile Cys Tyr Phe Pro Cys
65 70 75 80



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 284 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

GGAAATAATA ACGCATGCAA GAATCAGTGC ATTCGACTTG AGAAAGCACG ACATGGGTCT 60 

TGCAACTATG TCTTCCCAGC TCACAAGTGT ATCTGTTATT TCCCTTGTTA ATTCCATAAA 120 

CTCTTCGGTG GTTAATAGTG TGCGCATATT ACATATAATT AATAAGTTTG TGTCACTATT 180 

TATTAGTGAC TTTATGACAT GTGCCAGGTA TGTTTATGTT GGGTTGGTTG TAATATAAAA 240 

AAGTTCACGG ATAATAAGAT GATAAGCTCA CGTCGCCAAA AAAA 284 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
Gly Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Arg Leu Glu Lys Ala
1 5 10 15

Arg His Gly Ser Cys Asn Tyr Val Phe Pro Ala His Lys Cys Ile Cys
20 25 30 

Tyr Phe Pro Cys 
35 

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

CCCCGGGCTG CAGGAATTCG CGGCCGCGTT TTATTAGTGA TCATGGCTAA GTTTGCGTCC 60 

ATCATCGCAC TTCTTTTTGC TGCTCTTGTT CTTTTTGCTG CTTTCGAAGC ACCAACAATG 120 

GTGGAAGCAC AGAAGTTGTG CCAAAGGCCA AGTGGGACAT GGTCAGGAGT CTGTGGAAAC 180 

AATAACGCAT GCAAGAATCA GTGCATTAGA CTTGAGAAAG CACGACATGG ATCTTGCAAC 240 

TATGTCTTCC CAGCTCACAA GTGTATCTGC TACTTTCCTT GTTAATAG 288 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Met Ala Lys Phe Ala Ser Ile Ile Ala Leu Leu Phe Ala Ala Leu Val
1 5 10 15

Leu Phe Ala Ala Phe Glu Ala Pro Thr Met Val Glu Ala Gln Lys Leu
20 25 30

Cys Gln Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly Asn Asn Asn
35 40 45

Ala Cys Lys Asn Gln Cys Ile Arg Leu Glu Lys Ala Arg His Gly Ser
50 55 60

Cys Asn Tyr Val Phe Pro Ala His Lys Cys Ile Cys Tyr Phe Pro Cys
65 70 75 80

We claim:

1. A method of producing an antimicrobial-protein-producing
micro-organism capable of entering into an endosymbiotic
relationship with a plant host comprising the combination
of genetic material encoding a plant-derived
antimicrobial protein with an endophyte.

2. A method according to claim 1 in which the
plant-derived antimicrobial protein is
selected from the protein group consisting of
Mj-AMP1, Mj-AMP2, Ac-AMP1, Ac-AMP2, Ca-AMP1,
Bm-AMP1, Rs-AFP1, Rs-AFP2, Br-AFP1, Br-AFP2,
Bn-AFP1, Bn-AFP2, Sa-AFP1, Sa-AFP2, At-AFP1,
Dm-AMP1, Dm-AMP2, Cb-AMP1, Cb-AMP2, Lc-AFP,
Ct-AMP1, Ct-AMP2 and Rs-nsLTP.

3. A method according to claim 1 in which the
endophyte is Clavibacter xyli subsp.
cynodontis.

4. An antimicrobial-protein-producing
micro-organism produced by the method
according to claim 1.

5. A method for protecting a plant host from
disease comprising treating the plant host
with the antimicrobial-protein-producing
micro-organism according to claim 4.

6. A plant or seed treated with an
antimicrobial-protein-producing micro-organism
according to claim 4.



